Previously, we have introduced two operations on vectors: addition and scalar multiplication.
These operations allow us to combine vectors, but we still have no operation for multiplying two vectors together, like $\mathbf{u}\mathbf{v}$.
Put yourself in the position of a mathematician trying to come up with a way to multiply two vectors.
The first question would be: should the product of two vectors be a scalar or a vector?
Turns out, there are actually two ways to multiply two vectors together; one of them results in a scalar, and the other results in a vector.
Today, we will discuss the first of these two operations: the dot product (also known as the scalar product).
**Info:** There is actually a rising field of mathematics called geometric algebra that unifies these two operations into a single operation called the geometric product. It's super fascinating and powerful, but it's a topic for another day.
Before proceeding with the definition, we will first consider a more intuitive interpretation of the dot product.
Consider two vectors $\mathbf{u}$ and $\mathbf{v}$.
Their dot product is denoted by $\mathbf{u} \cdot \mathbf{v}$.
It outputs a scalar value, which is why it is also known as the scalar product.
To think of the dot product geometrically, do the following:
Place both vectors so that their tails are at the origin.
Project $\mathbf{u}$ onto $\mathbf{v}$.
Multiply the length of the projection by the length of $\mathbf{v}$.
The result is the dot product of $\mathbf{u}$ and $\mathbf{v}$.
One way to think about the projection is to physically hold a pencil and observe its shadow on the table/ground.
As you rotate the pencil, the shadow changes in length.
The length of that shadow is the projection of the pencil onto the table.
In the diagram above, the green arrow represents the projection of $\mathbf{u}$ onto $\mathbf{v}$.
In a word equation, the dot product can be written as:

$$\mathbf{u} \cdot \mathbf{v} = (\text{length of the projection of } \mathbf{u} \text{ onto } \mathbf{v}) \times (\text{length of } \mathbf{v})$$

The length of the projection (let's denote it as $L$) can be calculated through trigonometry, where $\theta$ is the angle between the two vectors:

$$L = \|\mathbf{u}\| \cos\theta$$

And therefore, the dot product can also be calculated as:

$$\mathbf{u} \cdot \mathbf{v} = \|\mathbf{u}\| \|\mathbf{v}\| \cos\theta$$
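As a quick numerical check of this formula, here is a minimal Python sketch (the helper names are my own, not from this article) that computes the dot product of two 2D vectors from their magnitudes, the angle between them, and the projection length:

```python
import math

# A minimal sketch of the geometric definition; helper names are hypothetical.
def magnitude(v):
    """Length of a 2D vector given as an (x, y) tuple."""
    return math.hypot(v[0], v[1])

def angle_between(u, v):
    """Signed angle from u to v for 2D vectors."""
    return math.atan2(v[1], v[0]) - math.atan2(u[1], u[0])

def dot_geometric(u, v):
    """|u| * |v| * cos(theta), the geometric definition of the dot product."""
    return magnitude(u) * magnitude(v) * math.cos(angle_between(u, v))

u, v = (3.0, 0.0), (2.0, 2.0)
# The projection of u onto v has signed length |u| cos(theta);
# multiplying by |v| reproduces the dot product.
proj_len = magnitude(u) * math.cos(angle_between(u, v))
print(dot_geometric(u, v))      # ≈ 6.0 for these vectors
print(proj_len * magnitude(v))  # same value
```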
The dot product, due to this projection property, is helpful for measuring the "similarity" between two vectors.
When $\mathbf{u}$ and $\mathbf{v}$ point in roughly the same direction ($\theta < 90°$), the dot product is positive.
When they are perpendicular ($\theta = 90°$), the dot product is zero, since the projection of one vector onto the other is zero. It's like holding the pencil perpendicular to the table: you get no shadow.
When they point in roughly opposite directions ($\theta > 90°$), the dot product is negative.
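To see this sign behavior concretely, the sketch below evaluates $\|\mathbf{u}\| \|\mathbf{v}\| \cos\theta$ at a few angles, with a fixed (arbitrarily chosen) pair of magnitudes:

```python
import math

def dot_from_angle(mag_u, mag_v, theta_deg):
    """Geometric dot product |u||v|cos(theta); only the angle controls the sign."""
    return mag_u * mag_v * math.cos(math.radians(theta_deg))

# |u| = 2, |v| = 3 throughout; only the angle between the vectors changes.
print(dot_from_angle(2, 3, 30))    # positive: similar directions
print(dot_from_angle(2, 3, 90))    # (numerically) zero: perpendicular
print(dot_from_angle(2, 3, 150))   # negative: opposing directions
```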
Additionally, the dot product is commutative, meaning that $\mathbf{u} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{u}$, and associative with scalar multiplication, meaning that $(a\mathbf{u}) \cdot (b\mathbf{v}) = ab(\mathbf{u} \cdot \mathbf{v})$.
This should be a surprising result.
In our geometric interpretation, we projected one specific vector onto another specific vector.
In other words, each vector had a specific role, so why should the dot product have these properties?
Recall that the dot product is defined as:

$$\mathbf{u} \cdot \mathbf{v} = \|\mathbf{u}\| \|\mathbf{v}\| \cos\theta$$

If we switch the order of $\mathbf{u}$ and $\mathbf{v}$, we get:

$$\mathbf{v} \cdot \mathbf{u} = \|\mathbf{v}\| \|\mathbf{u}\| \cos\theta = \|\mathbf{u}\| \|\mathbf{v}\| \cos\theta = \mathbf{u} \cdot \mathbf{v}$$

Hence, the dot product is commutative.
If we scale $\mathbf{u}$ by $a$ and $\mathbf{v}$ by $b$ (taking $a, b > 0$ for simplicity), we get:

$$(a\mathbf{u}) \cdot (b\mathbf{v}) = \|a\mathbf{u}\| \|b\mathbf{v}\| \cos\theta = ab \|\mathbf{u}\| \|\mathbf{v}\| \cos\theta = ab(\mathbf{u} \cdot \mathbf{v})$$

When scaling vectors by positive scalars, their directions do not change, only their magnitudes; hence, $\cos\theta$ remains the same. (A negative scalar flips the direction, changing the sign of $\cos\theta$, which exactly matches the sign it contributes to the product $ab$.)
With these observations, we can conclude that the dot product is both commutative and associative with scalar multiplication.
However, this is not the only way to interpret these properties. There is a geometric interpretation as well, and it stems from the symmetry of the dot product.
Consider a case where $\mathbf{u}$ and $\mathbf{v}$ have the same magnitude.
Draw a bisecting line between the two vectors so that they have a reflective symmetry:
Recall the geometric interpretation of the dot product:
With this in mind, we can make the following observations:
In this specific case, it should be clear from the symmetry that the projection of $\mathbf{u}$ onto $\mathbf{v}$ has the same length as the projection of $\mathbf{v}$ onto $\mathbf{u}$.
Since the magnitudes $\|\mathbf{u}\|$ and $\|\mathbf{v}\|$ are also equal, $\mathbf{u} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{u}$ in this case.
Next, consider scaling $\mathbf{u}$ into $a\mathbf{u}$, and consider how that changes each dot product.
$\mathbf{u} \cdot \mathbf{v}$: If you scale $\mathbf{u}$ by $a$, the projection of $\mathbf{u}$ onto $\mathbf{v}$ will also be scaled by $a$, but $\|\mathbf{v}\|$ remains the same.
Thus, the dot product will be scaled by $a$: $(a\mathbf{u}) \cdot \mathbf{v} = a(\mathbf{u} \cdot \mathbf{v})$.
$\mathbf{v} \cdot \mathbf{u}$: If you scale $\mathbf{u}$ by $a$, the projection of $\mathbf{v}$ onto $\mathbf{u}$ will remain the same, but $\|\mathbf{u}\|$ will be scaled by $a$.
Thus, the dot product will be scaled by $a$: $\mathbf{v} \cdot (a\mathbf{u}) = a(\mathbf{v} \cdot \mathbf{u})$.
We treated $a$ as an arbitrary scaling factor, so this property holds for any scalar $a$.
Thus: $(a\mathbf{u}) \cdot \mathbf{v} = a(\mathbf{u} \cdot \mathbf{v})$ and $\mathbf{v} \cdot (a\mathbf{u}) = a(\mathbf{v} \cdot \mathbf{u})$.
If we scale $\mathbf{v}$ as well, into $b\mathbf{v}$, the dot product will be scaled by $ab$.
Thus: $(a\mathbf{u}) \cdot (b\mathbf{v}) = ab(\mathbf{u} \cdot \mathbf{v})$.
Here, $a$ and $b$ can be any scalars, so starting from two equal-magnitude vectors and scaling, we can reach any pair of magnitudes.
As such, $\mathbf{u}$ and $\mathbf{v}$ can be any two vectors, and our observations hence hold in general.
With all these observations, we can conclude that the dot product is commutative:

$$\mathbf{u} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{u}$$

Additionally, we can also conclude based on the above observations that the dot product is associative with scalar multiplication:

$$(a\mathbf{u}) \cdot (b\mathbf{v}) = ab(\mathbf{u} \cdot \mathbf{v})$$
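These two properties are easy to sanity-check numerically. The sketch below uses the component-wise formula for the dot product (derived later in this article) on random vectors; the helper names are hypothetical:

```python
import random

def dot(u, v):
    # Component-wise formula for the dot product (derived later in the article)
    return sum(a * b for a, b in zip(u, v))

def scale(c, v):
    return [c * x for x in v]

random.seed(0)
u = [random.uniform(-1, 1) for _ in range(3)]
v = [random.uniform(-1, 1) for _ in range(3)]
a, b = 2.5, -1.5

# Commutativity: u . v == v . u
assert abs(dot(u, v) - dot(v, u)) < 1e-12
# Associativity with scalar multiplication: (a u) . (b v) == ab (u . v)
assert abs(dot(scale(a, u), scale(b, v)) - a * b * dot(u, v)) < 1e-12
print("both properties hold")
```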
The dot product also satisfies the distributive property. That is, for any three vectors $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$:

$$(\mathbf{u} + \mathbf{v}) \cdot \mathbf{w} = \mathbf{u} \cdot \mathbf{w} + \mathbf{v} \cdot \mathbf{w}$$
There is a beautiful geometric interpretation of this property.
Begin by considering what each side of the above equation represents.
We shall use the notation $\operatorname{proj}_{\mathbf{w}}(\mathbf{u})$ to denote the (signed) length of the projection of $\mathbf{u}$ onto $\mathbf{w}$, and similarly for $\mathbf{v}$.
With this in mind, rewrite both sides of the equation as follows:

$$(\mathbf{u} + \mathbf{v}) \cdot \mathbf{w} = \operatorname{proj}_{\mathbf{w}}(\mathbf{u} + \mathbf{v}) \, \|\mathbf{w}\|$$

$$\mathbf{u} \cdot \mathbf{w} + \mathbf{v} \cdot \mathbf{w} = \left( \operatorname{proj}_{\mathbf{w}}(\mathbf{u}) + \operatorname{proj}_{\mathbf{w}}(\mathbf{v}) \right) \|\mathbf{w}\|$$

On the left side, we project the entire sum $\mathbf{u} + \mathbf{v}$ onto $\mathbf{w}$, while on the right side, we project $\mathbf{u}$ and $\mathbf{v}$ onto $\mathbf{w}$ separately and then add the results.
Then, we multiply each of these projections by the magnitude of $\mathbf{w}$.
To show why these projections are equivalent, consider placing $\mathbf{u}$ and $\mathbf{v}$ head-to-tail, and then projecting each of them onto $\mathbf{w}$.
Then, draw the vector $\mathbf{u} + \mathbf{v}$ from the tail of $\mathbf{u}$ to the head of $\mathbf{v}$, and project it onto $\mathbf{w}$ as well:
In the diagram above, the blue vector represents the sum of $\mathbf{u}$ and $\mathbf{v}$.
Its projection covers exactly the same span as the two individual projections placed end-to-end.
As such, the distributive property of the dot product can be understood geometrically: projecting the sum of two vectors onto a third vector gives the same result as summing the projections of the two vectors onto that third vector.
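The distributive property can be checked numerically the same way; this sketch again relies on the component-wise formula discussed later in the article, with helper names of my own:

```python
def dot(u, v):
    # Component-wise dot product (derived later in the article)
    return sum(a * b for a, b in zip(u, v))

def add(u, v):
    return [a + b for a, b in zip(u, v)]

u, v, w = [1.0, 2.0], [3.0, -1.0], [0.5, 4.0]
lhs = dot(add(u, v), w)         # project the sum of u and v onto w
rhs = dot(u, w) + dot(v, w)     # project u and v separately, then add
print(lhs, rhs)                 # both equal 6.0 for these vectors
```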
This definition is great for understanding the geometric interpretation of the dot product, but it is quite tedious to calculate, requiring a trigonometric function and the magnitudes of the vectors.
There's actually a very simple way to calculate the dot product of two vectors, which can seem absolutely magical.
Given two vectors $\mathbf{u} = (u_1, u_2, \dots, u_n)$ and $\mathbf{v} = (v_1, v_2, \dots, v_n)$, the dot product can be calculated as:

$$\mathbf{u} \cdot \mathbf{v} = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n = \sum_{i=1}^{n} u_i v_i$$

In other words, the dot product of two vectors is simply the sum of the products of their corresponding components.
While this definition is easier to calculate, it does not have an obvious geometric interpretation.
However, it's a great way to compute the dot product in practice, and it can also be used to show many properties of the dot product very easily.
In the future, we will show in multiple ways why this definition is equivalent to the geometric definition.
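As a small taste of that equivalence, here is a Python sketch (helper names are my own) comparing the two definitions on a pair of 2D vectors:

```python
import math

def dot_components(u, v):
    # Sum of products of corresponding components
    return sum(a * b for a, b in zip(u, v))

def dot_geometric(u, v):
    # |u| |v| cos(theta) for 2D vectors
    theta = math.atan2(v[1], v[0]) - math.atan2(u[1], u[0])
    return math.hypot(*u) * math.hypot(*v) * math.cos(theta)

u, v = (1.0, 2.0), (3.0, 4.0)
print(dot_components(u, v))   # 11.0 exactly
print(dot_geometric(u, v))    # same value, up to floating-point error
```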
This is a more abstract concept, hence it is placed in the appendix as an optional read.
The dot product is only one way to multiply two vectors together.
There might be other ways to multiply vectors that are not the dot product, especially in spaces that are not just Euclidean.
One such generalization is the inner product; the dot product is simply a specific case of the inner product.
An inner product, denoted by $\langle \mathbf{u}, \mathbf{v} \rangle$, can refer to any operation that takes two vectors and returns a scalar, satisfying the following properties:
Symmetry: $\langle \mathbf{u}, \mathbf{v} \rangle = \langle \mathbf{v}, \mathbf{u} \rangle$.
(Side note: to be more precise, the inner product is actually conjugate-symmetric in complex spaces, meaning that if you invert the order of the vectors, you get the complex conjugate of the original inner product: $\langle \mathbf{u}, \mathbf{v} \rangle = \overline{\langle \mathbf{v}, \mathbf{u} \rangle}$.)
Linearity in the FIRST argument: $\langle a\mathbf{u} + b\mathbf{v}, \mathbf{w} \rangle = a\langle \mathbf{u}, \mathbf{w} \rangle + b\langle \mathbf{v}, \mathbf{w} \rangle$.
Positive definiteness: $\langle \mathbf{v}, \mathbf{v} \rangle \geq 0$, and $\langle \mathbf{v}, \mathbf{v} \rangle = 0$ if and only if $\mathbf{v} = \mathbf{0}$.
A good exercise would be to:
Show that the dot product satisfies these properties.
Think about why these properties would make sense for a product-like operation.
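For the first exercise, a numerical spot-check in Python can at least confirm the axioms hold for the dot product on random vectors (this is evidence, not a proof; the helper names are hypothetical):

```python
import random

def dot(u, v):
    # The standard component-wise dot product
    return sum(a * b for a, b in zip(u, v))

random.seed(1)
u = [random.uniform(-1, 1) for _ in range(4)]
v = [random.uniform(-1, 1) for _ in range(4)]
w = [random.uniform(-1, 1) for _ in range(4)]
a, b = 0.7, -2.0

# Symmetry: <u, v> == <v, u>
assert abs(dot(u, v) - dot(v, u)) < 1e-12
# Linearity in the first argument: <a u + b v, w> == a<u, w> + b<v, w>
lhs = dot([a * x + b * y for x, y in zip(u, v)], w)
rhs = a * dot(u, w) + b * dot(v, w)
assert abs(lhs - rhs) < 1e-12
# Positive definiteness: <u, u> > 0 for nonzero u, and <0, 0> == 0
assert dot(u, u) > 0 and dot([0.0] * 4, [0.0] * 4) == 0.0
print("dot product satisfies the inner product axioms (numerically)")
```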
If you have a vector space $V$ with an inner product $\langle \cdot, \cdot \rangle$, then $V$ (together with that inner product) is called an inner product space.
Next, we need to discuss something known as a Cauchy sequence.
This is quite abstract; it arguably belongs on a separate page, or an Appendix-Appendix, but it's still somewhat related to the dot product.
Just know that it is completely normal to not understand this section.
Suppose you have a sequence of vectors $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \dots$.
If the terms are getting closer and closer to each other, then they form a Cauchy sequence.
Conceptually, this is similar to the idea of a convergent sequence in calculus.
A Cauchy sequence of numbers looks like $1, \tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{8}, \dots$, where the terms are getting closer and closer to each other.
A non-Cauchy sequence would be something like $1, 2, 3, 4, \dots$, where the terms are not getting closer to each other.
Cauchy sequences also include vector sequences. In order to formalize a Cauchy sequence, consider this:
If terms get closer and closer to each other, you would be able to find two terms in the sequence that are arbitrarily close to each other.
So, whatever distance you choose, there is a point in the sequence after which all the terms are within that distance of each other.
Let's put it mathematically in a manner similar to how limits are defined in analysis:
A sequence $\mathbf{v}_1, \mathbf{v}_2, \dots$ is a Cauchy sequence if, for any $\varepsilon > 0$, there exists an $N$ such that for all $m, n > N$, $\|\mathbf{v}_m - \mathbf{v}_n\| < \varepsilon$.
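As a concrete sketch, the Python snippet below works with the sample sequence $v_n = 1/2^n$ (my own choice of example) and finds, for a given $\varepsilon$, an index $N$ past which all terms are within $\varepsilon$ of each other:

```python
def term(n):
    # A sample Cauchy sequence: v_n = 1 / 2**n
    return 1 / 2**n

def smallest_N(eps):
    # Smallest N with 1/2**N < eps; for this decreasing positive sequence,
    # all terms past index N then lie within eps of each other.
    N = 0
    while term(N) >= eps:
        N += 1
    return N

eps = 1e-3
N = smallest_N(eps)
# Spot-check the Cauchy condition on a stretch of indices beyond N
tail = [term(n) for n in range(N + 1, N + 50)]
assert max(tail) - min(tail) < eps
print(N)   # 10, since 1/2**10 < 1e-3
```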
Cauchy sequences look like they should approach a limit, but that limit might not exist within the set you are working in. If you have a set $S$, and every Cauchy sequence in $S$ converges to a point in $S$, then $S$ is called a complete space.
An example of an incomplete space is the set of rational numbers $\mathbb{Q}$: you can have a Cauchy sequence of rationals that approaches an irrational number like $\sqrt{2}$, so the sequence has no limit within $\mathbb{Q}$.
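A concrete illustration in Python (the example is mine, not from the article): Newton's iteration for $\sqrt{2}$, run with exact rational arithmetic, produces a sequence that lives entirely in $\mathbb{Q}$ yet heads toward a point missing from $\mathbb{Q}$:

```python
from fractions import Fraction

# Newton's iteration x -> (x + 2/x) / 2 stays inside the rationals at
# every step, yet its limit, sqrt(2), is irrational: a Cauchy sequence
# in Q whose limit is missing from Q.
x = Fraction(1)
for _ in range(6):
    x = (x + 2 / x) / 2

print(float(x))   # numerically indistinguishable from sqrt(2)
print(x * x == 2) # False: x is an exact rational, never exactly sqrt(2)
```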
Intuitively, a complete space like $\mathbb{R}$ is a "continuous" space where you can't find any "holes"/missing points.
There are ways to complete an incomplete space (similar to how you can remove discontinuities in a function with a limit), but that's a topic for another day.
Let's put what we have learned together. If you have a vector space $V$ with an inner product $\langle \cdot, \cdot \rangle$, and $V$ is complete with respect to the norm $\|\mathbf{v}\| = \sqrt{\langle \mathbf{v}, \mathbf{v} \rangle}$ (every Cauchy sequence converges to a value inside $V$), then $V$ is called a Hilbert space.
Hilbert spaces form the foundation of quantum mechanics, where vectors are wavefunctions and inner products give probability amplitudes.